Parallel Query Processing
Abstract
With relations growing larger and queries becoming more complex, parallel query processing is an increasingly attractive option for improving the performance of database systems. The objective of this paper is to examine the various issues encountered in parallel query processing and the techniques available for addressing them. The focus is on the join operation, with both sort-merge and hash joins being considered. Three types of parallelism can be exploited: intra-operator, inter-operator, and inter-query parallelism. In intra-operator parallelism, the major issue is task creation, and the objective is to split a join operation into tasks such that the load can be spread evenly across a given number of processors. This is a challenge when the values of the join attribute are not uniformly distributed. Inter-operator parallelism can be achieved either through parallel execution of independent operations or through pipelining. In either case, the major issues are join sequence selection, which determines the precedence relations among the operations and the pipeline structures that can be exploited, and processor allocation for each operation. For inter-query parallelism, the issue is again processor allocation, but at the query level. Various techniques to address each of these issues have been studied in the past, albeit under different assumptions and generally focusing on only one of the issues. In this paper, we explore a method for obtaining inter-query parallelism based on a hierarchical approach and a unified framework, so that the potential integration of the techniques used to address each type of parallelism can be illustrated.
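As a rough illustration of the task-creation issue described in the abstract, the following minimal sketch splits a join into independent tasks by partitioning both relations on the join attribute, then reports the load per task. It is not the paper's method: the function names (`partition_by_key`, `hash_join`, `parallel_join_tasks`, `task_loads`), the use of integer keys with a modulo partitioning function, and the tuple layout with the join attribute in the first column are all assumptions made for the example.

```python
from collections import defaultdict

def partition_by_key(rows, num_tasks):
    """Assign each row to a task using its join attribute (column 0).

    Assumption: keys are non-negative integers, so key % num_tasks
    serves as a simple, deterministic partitioning function.
    """
    parts = defaultdict(list)
    for row in rows:
        parts[row[0] % num_tasks].append(row)
    return parts

def hash_join(r_rows, s_rows):
    """Join two row lists on column 0: build a table on R, probe with S."""
    table = defaultdict(list)
    for r in r_rows:
        table[r[0]].append(r)
    return [r + s[1:] for s in s_rows for r in table.get(s[0], [])]

def parallel_join_tasks(r, s, num_tasks):
    """Split R join S into num_tasks independent (R-part, S-part) tasks."""
    r_parts = partition_by_key(r, num_tasks)
    s_parts = partition_by_key(s, num_tasks)
    return [(r_parts.get(t, []), s_parts.get(t, [])) for t in range(num_tasks)]

def task_loads(tasks):
    """Approximate each task's load by the number of input rows it receives."""
    return [len(r_part) + len(s_part) for r_part, s_part in tasks]

# Hypothetical data: R has keys 0..7; S is either uniform or fully skewed on key 0.
r = [(k, "r%d" % k) for k in range(8)]
s_uniform = [(k, "s%d" % k) for k in range(8)]
s_skewed = [(0, "s%d" % i) for i in range(8)]

print(task_loads(parallel_join_tasks(r, s_uniform, 4)))  # [4, 4, 4, 4]
print(task_loads(parallel_join_tasks(r, s_skewed, 4)))   # [10, 2, 2, 2]
```

With uniformly distributed join values the four tasks receive equal input, whereas with all S tuples sharing one join value a single task receives most of the work. This is the kind of imbalance that makes task creation difficult under skew.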
Similar resources
Non-zero probability of nearest neighbor searching
Nearest Neighbor (NN) searching is a challenging problem in data management and has been widely studied in data mining, pattern recognition and computational geometry. The goal of NN searching is to efficiently report the data nearest to a given query object. In most studies, both the data and the query are assumed to be precise; however, due to the real applications of NN searching, suc...
Dynamic Aspects of Query Processing in Parallel Database Systems
This paper reports ongoing work in developing a query representation method for the implementation of dynamic and parallel query processing in database systems. We present Stream Processing based Query Representation, a concept derived from software engineering, as a powerful approach, which covers query representation from high-level query declaration to low-level procedural (parallel) executi...
Selecting the Most Suitable Query Language for Using Outer Joins to Extract Data in Datalog Mode in the DES Deductive Database System
Deductive database systems are designed based on a logical data model. In a deductive database system, data are saved as facts, as opposed to a relational database management system (RDBMS), in which data are stored in tables. Datalog Educational System (DES) is a deductive database system whose default mode is Datalog. It can extract data to use outer joins with three query la...
Effect of Inverted Index Partitioning Schemes on Performance of Query Processing in Parallel Text Retrieval Systems
Shared-nothing, parallel text retrieval systems require an inverted index, representing a document collection, to be partitioned among a number of processors. In general, the index can be partitioned based on either the terms or documents in the collection, and the way the partitioning is done greatly affects the query processing performance of the parallel system. In this work, we investigate ...
Optimization Strategies for Parallel Linear Recursive Query Processing
Query optimization for sequential execution of non-recursive queries has reached a high level of sophistication in commercial DBMS. The successful application of parallel processing for the evaluation of recursive queries will require a query optimizer of comparable sophistication. The groundwork for creating this new breed of query optimizer will consist of a combination of theoretical insight...
Exploiting Reconfigurable FPGA for Parallel Query Processing in Computation Intensive Data Mining Applications
This work concentrates on exploiting re-configurable Field Programmable Gate Arrays (FPGAs), an SRAM-based FPGA coprocessor, for query processing in computation-intensive data mining applications. Complex computation-intensive data mining applications in geoscientific and medical information systems environments often require support for extensibility and parallel processing to deliver the nece...